.. _tutorial_9:
.. TU:
@ Check GCMC tutorial has molecules instead of atoms as the key unit
@ Eliminate the tabs in the CONTROL file
@ Better signposting.
@ Needs a look over
@ Add references, future work; mention block averaging as a way to make a plot of the phase diagram.
=======================================================
TUTORIAL 9 : Gibbs ensemble Monte Carlo
=======================================================
:Author:
Tom L. Underwood, t.l.underwood@bath.ac.uk
Introduction
============
The liquidvapour coexistence curve (i.e. the temperatures and pressures corresponding to the boiling phase transition)
is one of the key features of a substance's phase diagram. An important application of Monte Carlo simulation is to
calculate this curve for a given fluid. Monte Carlo is regarded as the method par excellence for such calculations.
The reason for this is that in Monte Carlo one can add and remove particles from the system. It turns out that this
enables liquidvapour coexistence to be probed extremely efficiently, more efficiently than if one performed a simulation
with a fixed number of particles.
Gibbs ensemble Monte Carlo (GEMC) is a popular method for calculating liquidvapour coexistence curves. In this
tutorial you will use GEMC in DL_MONTE to calculate the liquidvapour coexistence curve for the LennardJones fluid,
in which the particles interact via the LennardJones potential:
.. math::
\phi(r) = 4\epsilon\biggl[\Bigl(\frac{\sigma}{r}\Bigr)^{12}\Bigl(\frac{\sigma}{r}\Bigr)^{6}\biggr],
where we truncate the interactions at a distance :math:`r_c=2.5\sigma`.
To elaborate, at various temperatures, you will calculate the gas and liquid densities corresponding to coexistence using
GEMC.
Prerequisites

This tutorial assumes that the reader is familiar with the basics of DL_MONTE, i.e. can perform simple NVT, NPT and GCMC
simulations: see :ref:`tutorial_1`, :ref:`tutorial_2` and :ref:`tutorial_4`.
Solutions

Note that example output and input files for all simulations to be performed in this tutorial can be found in the *solutions*
subdirectory within the *tutorial9* directory in the archive containing the tutorial files.
Background and methodology
==========================
In most simulation techniques the system to be simulated is comprised of one 'box', which is subject to periodic boundary
conditions and which contains particles whose positions are evolved somehow during the course of the simulation. GEMC
is somewhat unusual in that the system is comprised of *two* simulation boxes. Particles in one box do not interact with
particles in the other box. Moreover, in GEMC the (twobox) system is evolved using the following Monte Carlo moves (where
we are assuming that the particles are molecules with rotational degrees of freedom):
1. Translate a single molecule: a *translation move*.
2. Rotate a single molecule: a *rotation move*.
3. Transfer a molecule from one box to a random position in the other box: we refer to this here as a *transfer* move.
4. Change the volume of one box by an amount :math:`\Delta V`, determined stochastically, while simultaneously changing the
the other box by an amount :math:`\Delta V` so that the total volume of the twobox system is preserved in the move. We
will refer to this as a *volume move* here. (Note however that this volume move is not the same as the volume move used
in NPT simulations the Gibbs volume move affects two boxes).
Note that moves 1 and 2 above are also employed in NVT and NPT simulations. Moves 3 and 4 are, however, particular to GEMC.
For brevity we do not give the formulae for the acceptance probabilities of the above moves here; such details, as well
as further information describing GEMC is provided at the end of this tutorial.
Moreover, while we mentioned above that the total volume of the two boxes is conserved in GEMC, it is worth emphasising that the total
number of molecules in the system is also conserved, though the number in *each* box can vary during the simulation as a
result of molecules being transferred between the two boxes.
Consider how the phase of a singlecomponent fluid changes with density :math:`\rho` at a fixed temperature :math:`T`. Clearly at low
densities the fluid will be in the gas phase, and at high densities the fluid will be in the liquid phase. At intermediate densities,
however, there is a *twophase region* demarcated by the gas and liquid densities corresponding to coexistence, which we denote as
:math:`rho_G` and :math:`\rho_L`, respecively. The twophase region is illustrated in the figure below. Note that it only exists below
the critical temperature.
.. figure:: images/tutorial9phasediagram.png
:width: 600px
For :math:`rho` and :math:`T` within the twophase region the equilibrium state of the fluid is a mixed system, comprised of gas
and liquid phases which coexist, where densities of the gas and liquid are given by the :math:`rho_G` and :math:`\rho_L` appropriate
to :math:`T` under consideration.
The elegance of GEMC is that if the :math:`T` and initial density :math:`\rho` of the system (i.e. the *total*
number of molecules in the system, over *both* boxes, divided by the total volume of the system) reside within the twophase
region, then the system will 'find' this twophase equilibrium. It does this by exchanging molecules and volume between the two boxes
until one box becomes a gas with :math:`rho_G` and the other box corresponds to the liquid with :math:`\rho_L`.
It finds this
equilibrium relatively quickly because, unlike other methods, in the twobox system there is never a liquidgas interface.
Once the twophase equilibrium has been found :math:`rho_G` and :math:`\rho_L` can be obtained from the simulation results by
simply measuring the densities of the gas and liquid boxes.
We will now demonstrate this using DL_MONTE.
Your first GEMC simulation
==========================
We begin with a GEMC simulation at :math:`T^*\equiv k_BT/\epsilon = 1.0` and :math:`\rho=0.3`, which happens to be in the middle
of the twophase region of the phase diagram for the LJ fluid.
FIELD

We begin by inspecting the FIELD file (note that the line numbers at the start of each line are not actually present in the file):
.. codeblock:: html
:linenos:
LennardJones, 2.5*sigma cutoff, sigma=1A, epsilon = k; 2 CONFIGS for Gibbs ensemble
CUTOFF 2.5
UNITS k
NCONFIGS 2
ATOMS 1
LJ core 1.0 0.0
MOLTYPES 1
lj
ATOMS 1 1
LJ core 0.0 0.0 0.0
FINISH
VDW 1
LJ core LJ core lj 1.0 1.0
CLOSE
This file is similar to the one you encountered in the grandcanonical Monte Carlo tutorial. Crucially though, *NCONFIGS* is
set to 2 instead of 1, since GEMC uses two simulation boxes. Thus DL_MONTE will expect to find two configurations in the CONFIG file.
CONFIG

Inspecting the CONFIG file, we indeed find two configurations:
.. codeblock:: html
:linenos:
LennardJones starting; particles are molecules, not atoms
0 0
7.93700526 0.0000000 0.0000000
0.0000000 7.93700526 0.0000000
0.0000000 0.0000000 7.93700526
NUMMOL 150 1000
MOLECULE lj 1 1
LJ c
0.139682 0.116455 0.0535593
MOLECULE lj 1 1
LJ c
0.457034 0.421912 0.313942
MOLECULE lj 1 1
LJ c
0.269838 0.226045 0.357893
MOLECULE lj 1 1
LJ c
0.417234 0.0306274 0.173153
...
...
...
MOLECULE lj 1 1
LJ c
0.32143 0.170441 0.344404
MOLECULE lj 1 1
LJ c
0.258839 0.218443 0.290028
MOLECULE lj 1 1
LJ c
0.481594 0.0323228 0.447171
LennardJones starting; particles are molecules, not atoms
0 0
7.93700526 0.0000000 0.0000000
0.0000000 7.93700526 0.0000000
0.0000000 0.0000000 7.93700526
NUMMOL 150 1000
MOLECULE lj 1 1
LJ c
0.139682 0.116455 0.0535593
MOLECULE lj 1 1
LJ c
0.457034 0.421912 0.313942
MOLECULE lj 1 1
LJ c
0.269838 0.226045 0.357893
...
Note that in this case both configurations are identical. This is the case for the sake of convenience: when setting up the
initial configurations of the two boxes, it is easier to make them the same, and both corresponding to the density of interest.
One could, however, start with boxes with different configurations.
Note also that the second number on the lines immediately below the configuration title lines (containing 'LennardJones starting;
particles...') is 0. This signifies that the positions of atoms in the following configurations are given in *fractional*
coordinates in the CONFIG file, not Cartesian coordinates as would normally be the case. We do this here for convenience;
later we will perform GEMC simulations using different initial densities, and the use of fractional coordinates will make it
easier to generate the required CONFIG files.
CONTROL

Finally, we come to the CONTROL file:
.. codeblock:: html
:linenos:
Gibbs simulation LennardJones fluid # Comment line
use ortho # Flag to use 'fast' functionality for orthorhombic boxes
finish
ranseed # Use a random seed
temperature 1.0
steps 1800000 # Simulation length
sample coord 300000 # Frequency to store configurations during simulation
stat 300000 # Frequency to print to PTFILE
print 300000 # Frequency to print to OUTPUT
yamldata 1000 # Frequency to output important data to YAMLDATA files
noewald all
move molecule 1 100 # Translate molecules, followed by frequency (100)
lj
move gibbstransfmol 1 100
lj
move volume gibbs 1
start
The key features of this file are the *move gibbstransfmol* and *move volume gibbs* directives, which along
with the use of two simulation boxes, make the DL_MONTE simulation a GEMC simulation. *move gibbstransfmol*
invokes transfer moves betwen the two boxes. Here we apply the moves to the 'lj' molecular species, which is
why 'lj' is specified in the line following *move gibbstransfmol* above. *move volume gibbs* invokes Gibbs
volume moves where volume is transfered between the two boxes, but the total volume of the whole system is
conserved. As with the other *move* directives, the frequencies of each move is specified; in the above
CONTROL file the ratio of translation moves to transfer moves to volume moves is 100:100:1.
*Exercise*

Create a directory called *density_0.3*, and copy the CONTROL, CONFIG and FIELD files into that directory.
Then perform a DL_MONTE simulation in the directory. The simulation should not take long to complete: 1.53
minutes.
Output files

GEMC simulations in DL_MONTE create two versions of many output files, one for each simulation box; see the DL_MONTE
manual for details. Of importance here are the YAMLDATA files: a GEMC simulation will produce two YAMLDATA files:
YAMLDATA1.000 pertains to simulation box 1, while YAMLDATA2.000 pertains to simulation box 2. Here are the first
few lines of YAMLDATA2.000 output by an instance of the above simulation:
.. codeblock:: html
:linenos:
temperature: 1.00000000000000E+00
gibbsmols:


timestamp: 1000
energy: 1.11660465569217E+02
energyvdw: 1.11660465569217E+02
volume: 4.99999778765834E+02
nmol: [ 146 ]

timestamp: 2000
energy: 2.54952759149814E+02
energyvdw: 2.54952759149814E+02
volume: 4.99998945385850E+02
nmol: [ 143 ]

timestamp: 3000
energy: 2.40232294084996E+02
energyvdw: 2.40232294084996E+02
volume: 4.99997535232507E+02
nmol: [ 132 ]

timestamp: 4000
energy: 2.23217638349257E+02
energyvdw: 2.23217638349257E+02
volume: 4.99997053420530E+02
nmol: [ 128 ]
The meaning of the data should be obvious. Recall that we are interested in using GEMC to determine the
liquid and gas densities corresponding to coexistence at this temperature, and that, if the simulation
was successful, then the equilibrated density of one box should correspond to a liquid and the equilibrated
density of the other box should correspond to a gas. The density at a give timestep in the simulation can
be obtained from the number of molecules in the box and the box's volume at that timestep. This information
can be obtained from YAMLDATA1.000: see the *nmol:* and *volume:* tags in the example above.
*Exercise*

Included with the tutorial files is a script, *yaml_to_density_gibbs.sh* which can be used to extract a time
series of densities from a YAMLDATA1.000 or YAMLDATA2.000 file obtained from a DL_MONTE GEMC simulation.
Apply the following commands to create two files, *density_1.dat* and *density_2.dat* which contain such time
series for both boxes.
::
[tutorial_9]$ yaml_to_rho_gibbs.sh YAMLDATA1.000 > density_1.dat
[tutorial_9]$ yaml_to_rho_gibbs.sh YAMLDATA2.000 > density_2.dat
Then plot the files. You should see that one box equilibrates at the gas density (about 0.05) and the other
at the liquid density (about 0.65).
Example plots of *density_1.dat* and *density_2.dat* are given below.
.. figure:: images/tutorial9density0.3.png
:width: 600px
In this case box 1 becomes the liquid and box 2 becomes the gas. Note that there is an equilibration time
at the beginning of the ssimulation, during which the two boxes 'find' the twophase equilibrium of the system.
Exploring different initial conditions
======================================
Recall that the atomic coordinates in the CONFIG files are fractional. Recall also that the initial
density of the system is 0.3: as can be seen from inspection of the CONFIG file the configuration for each
box consists of 150 molecules in a cubic cell with dimension 7.93700526, and hence the density is
:math:`150/7.93700526^3=0.3`. Now, one could change the initial density of the system by altering the
dimensions of both cells in CONFIG; denoting the cell dimension by :math:`L`, the density is given by
:math:`\rho=150/L^3`.
*Exercise*

With this in mind, create a new directory *density_0.2*, and populate it with the
CONTROL, CONFIG and FIELD files used for the simulation described above. Then modify the cell dimensions in the
CONFIG file so that it corresponds to an initial system density of 0.2. Then run a DL_MONTE simulation in that directory.
Then create *density_1.dat* and *density_2.dat* data files as described above, and plot them.
Example plots of *density_1.dat* and *density_2.dat* at a density of 0.2 are given below.
.. figure:: images/tutorial9density0.2.png
:width: 600px
Note that the system finds the same twophase equilibrium: a gas and liquid with densities of :math:`\approx 0.05`
and 0.65, respectively. However, you should also see that the equilibration period is longer than for :math:`\rho=0.3`.
This reflects the fact that the further the initial system density is from the centre of the twophase region, the
longer the GEMC simulation will take to find equilibrium  for this system :`\rho=0.3` is closer to the centre of
the twophase region than :math:`\rho=0.2`.
To cement this point, perform another simulation at :math:`\rho=0.15`. You should see that the equilibration time
is even longer than for :math:`\rho=0.2`.
Finally, perform a simulation at :math:`\rho=0.01`. You should see that both boxes remain rigidly at their initial
densities. Example plots are shown below.
.. figure:: images/tutorial9density0.01.png
:width: 600px
There are two reasons why this might be the case. One is that the equilibration time is so long that we
have yet to observe a discernable change in the boxes towards the twophase equilibrium. The other possibility is
that there is no twophase equilibrium at this density, i.e. the density resides within the onephase region of the
phase diagram. In this case the two boxes will both equilibrate in the equilibrium phase, and thus will both have
the same density. In fact :math:`\rho=0.01` resides within the onephase region of the phase diagram for this model,
explaining our observations.
Calculating the phase diagram
=============================
We have learned above, for a given temperature, if the GEMC simulation is set up with an initial system density
close to the centre of the twophase region, then equilibration to the twophase equilibrium will be relatively
quick. Once the simulation has found equilibrium, the gas and liquid densities corresponding to coexistence, i.e. :math:`\rho_G`
and :math:`\rho_L`, can be obtained via simple analysis of the simulation results: :math:`\rho_G` is the
equilibrium density of the box with the lower density, and :math:`\rho_L` is the equilibrium density of the
box with the higher density.
For this system, :math:`\rho=0.3` has proved to be close to the centre of the twophase region at :math:`T=1.0`.
We therefore expect it would be a good initial density to use in GEMC simulations to determine :math:`\rho_G` and
:math:`\rho_L` at other temperatures. If, with :math:`\rho=0.3`, we find that equilibration is slow, then we can
always tweak the density to find a better one.
*Exercise*

Using what you have learned above, use GEMC to come up with values for :math:`\rho_G` and :math:`\rho_L` at the following
temperatures: :math:`T^*=0.8`, 0.9, 1.0, 1.1, 1.15 and 1.2. Recall that :math:`T^*\equiv k_BT/\epsilon`. Hence :math:`T^*`, which
in the above simulations took the value 1, can be changed by altering :math:`\epsilon` in the FIELD file. Noting that
the unit of energy in FIELD is :math:`k_BT` (via the *k* option for units), the LennardJones :math:`\epsilon` in FIELD
should be set to :math:`1/T^*` to realise a reduced temperature of :math:`T^*`.
You should find that the gap between :math:`\rho_G` and :math:`\rho_L` narrows as the temperature is increased.
This is expected: as can be seen from the schematic phase diagram earlier in this tutorial, the difference between :math:`\rho_G`
and :math:`\rho_L` decreases as temperature is increased, and tends to zero at the critical temperature.
You should see that something interesting happens at :math:`T^*=1.2`. This is very close to the critical temperature.
You should see that there are massive fluctuations in the density of both boxes, fluctuations which take the system from
liquid to gas and vice versa. In fact, when one box is a liquid the other is a gas, and vice versa. Such fluctuations are
characteristic of fluids near the critical point, and complicate analysis of the simulation data. However, methods for
analysing data near the critical point do exist, though this is beyond the scope of the current tutorial.
An example of what you might see in your simulations is given below.
.. figure:: images/tutorial9manytemps.png
:width: 600px
Further information
===================
* An instructive pedagogical article discussing GEMC for the liquidgas problem can be found `here `_.
* The `Sklog wiki page for GEMC `_ has links to the original
references for GEMC, as well as some other interesting links.
* `This paper `_ contains plots of the phase diagram for the model considered here: see Fig. 2, dotted
curve.